Type 2 diabetes susceptibility genes

ABSTRACT

Two of the genetic bases for susceptibility to type 2 diabetes are disclosed. The alleles of the genes SorCS1 and SorCS3 that a person carries can determine whether or not that person is susceptible to type 2 diabetes.

CROSS-REFERENCE TO RELATED APPLICATION

[0001] This application claims priority from U.S. provisional patent application Serial No. 60/409,525 filed Sep. 9, 2002.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

[0002] To be determined.

BACKGROUND OF THE INVENTION

[0003] Type 2 diabetes is also called non-insulin dependent diabetes mellitus (NIDDM) or adult onset diabetes. Over 90% of diabetes is of the type 2 kind. The American Diabetes Association reports that there are 12 million Americans with type 2 diabetes and another 7 million potential candidates. An annual expenditure of $100 billion is attributed to the disease. It is the third leading cause of death, accounting for 62,000 deaths each year. Prolonged untreated diabetes leads to heart disease, stroke, kidney disease, blindness, and loss of limbs from advanced peripheral vascular disease.

[0004] Type 2 diabetes involves insulin resistance coupled with failure of the pancreatic β-cells to secrete enough insulin to maintain euglycemia (1-3). Although insulin resistance is a feature of type 2 diabetes, an individual can be severely insulin resistant without ever exhibiting fasting hyperglycemia; β-cell insufficiency is an essential feature of type 2 diabetes (4). The question becomes why some people become severely insulin resistant without developing type 2 diabetes while other people do develop the disease. A logical further question is whether or not a genetic predisposition to the disease exists.

[0005] Obesity is an important independent risk factor for the development of type 2 diabetes: more than 80% of type 2 diabetic patients are obese. Nevertheless, although most obese people are insulin resistant, the majority remain euglycemic. Currently, there are few tools available to help predict which obese individuals will progress to type 2 diabetes. Again the question is why some individuals are obese and insulin resistant, but not diabetic, while others develop the disease.

[0006] Type 2 diabetes tends to run within families and ethnic groups, suggesting a strong genetic contribution to the disease (5). However, the major type 2 diabetes susceptibility genes were heretofore unknown. Identification of susceptibility genes for type 2 diabetes will provide screening tools for identifying individuals who are susceptible to the disease and related diseases so that they can take prophylactic measures. In addition, it can lead to the development of new prevention and treatment tools for the disease. These tools can be used to identify therapeutic agents for the treatment of the disease and related diseases.

SUMMARY OF THE INVENTION

[0007] The present invention is summarized in that a method of assessing whether a human subject is susceptible to type 2 diabetes is based on the step of determining the allele in the genome of that subject of the SorCS1 gene or the SorCS3 gene.

[0008] It is a feature of the present invention that one of the genetic bases for susceptibility to type 2 diabetes has been identified.

[0009] It is an object of the present invention to enable genetic tests to determine if individuals have a genetic susceptibility to type 2 diabetes arising from the allele of the SorCS1 gene or the SorCS3 gene carried by that individual.

[0010] Other objects, advantages, and features of the present invention will become apparent from the following specification.

BRIEF DESCRIPTION OF THE DRAWING FIGURES

[0011] FIG. 1 is a genetic map of a region on mouse chromosome 19 in which the genetic element responsible for susceptibility to type 2 diabetes was found.

[0012] FIG. 2 is a best fit genetic comparison of the amino acid sequences of human and mouse SorCS1 proteins.

DESCRIPTION OF THE INVENTION

[0013] It is taught here that a mammalian gene known as SorCS1 is one of the genetic elements which can make a person susceptible to type 2 diabetes. An alteration to the human SorCS1 gene makes an individual susceptible to developing type 2 diabetes. The mutant form of the gene does not by itself cause type 2 diabetes; there must still be the conditions that lead to insulin insensitivity, such as obesity. The identification of this gene as a contributor to susceptibility to type 2 diabetes begins to answer the questions about why some people develop type 2 diabetes while others do not.

[0014] A similar indication has been found about a related gene known as SorCS3. Alterations in the gene and resultant protein for the SorCS3 locus are also indicators of susceptibility for type 2 diabetes in humans.

[0015] The identification of the SorCS1 gene as a type 2 diabetes susceptibility gene was worked out in two congenic mouse strains, which have a SorCS1 gene directly analogous to the human gene. In summary, two groups of obese mice were identified: a first group which would develop a severe form of type 2 diabetes and a second group which proved to develop a less severe form of type 2 diabetes. By breeding and genetic testing, the source of the genetic difference between the two groups of mice was identified. Two loci were mapped that determined diabetes susceptibility. One locus was on chromosome 16, where the diabetes-associated allele comes from a diabetes-susceptible mouse strain, BTBR, which would develop only the less severe form of diabetes. The second locus was found to be located on chromosome 19, and this allele, carried in a mouse strain, B6, was associated with the more severe form of type 2 diabetes. The phenomenon by which a disease trait is transmitted from the unaffected parent to its offspring is termed “transgression.” The strongest data come from congenic mice: BTBR obese mice are diabetic, and the severity of their diabetes is much greater if they inherit a 7 Mb segment of chromosome 19 from a B6 parent. The mice exhibited very high levels of plasma glucose, averaging 120 mg/dl, more than the glucose level of BTBR obese mice. It was ultimately determined that the more severely diabetic mice have an allele of the SorCS1 protein (the B6 allele) that differs by three amino acids from the allele of that same protein in the BTBR mice, which would not develop type 2 diabetes to the same degree of severity. In other words, the difference in susceptibility to severe diabetes resolved down to differences in the allele of the gene for SorCS1. Since the same phenomenon exists in humans, and the analogous SorCS1 gene is found in humans, the same variation in severity of diabetes is found in humans. Thus it is now possible to perform genetic tests of human individuals and determine if the patient is genetically susceptible to severe type 2 diabetes due to his or her allele of the SorCS1 gene. Note that this gene is one of the sources of genetic susceptibility to type 2 diabetes, but it may or may not be the source of all such susceptibility. It is possible that there are other genes which contribute to the genetic susceptibility to this disease. What can be said here is that this gene is at least one of the sources of genetic susceptibility to type 2 diabetes, and that allelic differences in this gene are alone sufficient to explain some of the genetic susceptibility to the disease.

[0016] The identification of these two genes as traits for susceptibility to severe type 2 diabetes suggests new diagnostic, prevention and treatment tools for type 2 diabetes and related diseases. Related diseases include those diseases and conditions which are treated or ameliorated by modulation of SorCS1 activity or expression. These diseases and conditions include type 1 diabetes, and other disorders relating to glucose metabolism, insulin secretion, insulin degradation, vesicle transport in secretory cells, pancreas and hepatocyte activity, dyslipidemia and obesity.

[0017] As described in the example below, the inventors began by narrowing down the genetic region associated with a genetic cause of severe type 2 diabetes to a 7 Mb segment of mouse chromosome 19, to allow the identification of genes that are associated with the severe form of type 2 diabetes. Two previously known genes, SorCS1 and SorCS3, were found in that region. SorCS1 and SorCS3 belong to the sortilin gene family, which includes sortilin, SorCS1, SorCS2, SorCS3 and sorLA. The genes in this family share a large region of similarity including the VPS10 domain. Sortilin is located in vesicles in muscle and adipose tissue that contain glut4. Glut4 is the insulin-sensitive glucose transporter that is shuttled to the cell surface upon insulin stimulation to enable cells to import glucose at a higher rate. Sortilin also binds to lipoprotein lipase and a neuropeptide called neurotensin through the VPS10 domain. Thus, SorCS3 and SorCS1 are expected to be involved in insulin-stimulated glucose transport and in controlling body fat metabolism. Verifying which of these genetic elements was responsible for the difference in susceptibility to diabetes required characterization of the genes and the alleles present in those genes.

[0018] The inventors thus proceeded to characterize the genes and sequences in the 7 Mb region. It was discovered that for each of the genes present, the alleles carried by the most severely diabetic mice were the same as the alleles carried by the less severely affected mice, with the sole exception of the allele of the SorCS1 gene. FIG. 1 illustrates a genetic map of the genetic elements found in the 7 Mb region associated with the genetic difference. The region between map units 55 and 48 carried the genetic difference. The alleles of the SorCS3 gene turned out to be identical in the two strains of mice. As illustrated in Table 1 below, however, the susceptible mice had an allele of the SorCS1 gene that differs by three nucleotides from that of the less severely diabetic mice. The resulting protein likewise differs by three amino acids. This difference results in a genetic susceptibility to type 2 diabetes.

TABLE 1
SorCS1 mutations altering amino acids

Nucleotide position   Nucleotide       Amino acid position   Amino acid       Isoform(s)
in cDNA               B6     BTBR      in protein            B6     BTBR
172                   C      T         50                    Thr    Ile       a, b, c
3433                  C      T         1139                  Ser    Phe       a
3462                  T      C         1149                  Ser    Pro       c

[0019] The genomic and cDNA sequences of human SorCS1 are known. The human SorCS1 cDNA sequence (GenBank Accession No. NM_052918) and SorCS1 amino acid sequence (GenBank Accession No. NP_443150) are incorporated herein as SEQ ID NO:1 and SEQ ID NO:2, respectively. Also shown in the sequence attachment hereto are the amino acid sequences of mouse (SorCS1a, SorCS1b, SorCS1c) and human (SorCS1a, SorCS1b, and SorCS1c). Also shown in FIG. 2 is an amino acid sequence alignment between mouse SorCS1b (mSorCS1b) and human SorCS1 (hSorcs1). Note that the sequences are highly homologous, with a sequence identity of 93%. It is this degree of identity that provides the rationale for the prediction that the genetic evidence from the congenic mouse model presented here does, in fact, predict the same genetic phenomenon in humans.
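By way of illustration only, a sequence identity figure such as the one cited above can be computed from a pairwise alignment roughly as follows. This is a minimal sketch, not the method used to obtain the 93% value: the function name percent_identity and the choice of denominator (all alignment columns that are not gaps in both sequences) are assumptions, and other conventions give slightly different numbers. The human fragment is the first 18 residues of SEQ ID NO:2; the second fragment is invented for the example and is not mouse sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumes the two strings are already aligned (equal length, gaps as '-').
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns that are gaps in both sequences; they carry no information.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# First 18 residues of SEQ ID NO:2 (one-letter code).
human_fragment = "MEAARTERPAGRPGAPLV"
# Hypothetical comparison fragment with one gap, for illustration only.
toy_fragment = "MEAARTERP-GRPGAPLV"

print(round(percent_identity(toy_fragment, human_fragment), 1))
```

In this toy case 17 of 18 columns match. A real comparison would first align the full-length proteins with an alignment tool and then apply the same counting step.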

[0020] From a diagnostic perspective, individual human beings can be examined for the allele of their SorCS1 gene as a step in determining whether they are susceptible to developing type 2 diabetes. For example, the SorCS1 cDNA sequence of an individual can be determined and the deduced amino acid sequence can be compared to SEQ ID NO:2. If a mutation at the amino acid level is detected, especially if the mutation is one other than a conservative substitution, the individual can be identified as susceptible to developing type 2 diabetes.
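The comparison described in the preceding paragraph can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not a validated diagnostic: the function names translate and substitutions are hypothetical, the standard genetic code is assumed, and the grouping of amino acids into "conservative" classes (GROUPS below) is one common physicochemical grouping chosen for the example, since the specification does not define one. The reference coding fragment is the first six codons of the coding region of SEQ ID NO:1; the variant is invented.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumed physicochemical classes; a substitution within a class is
# treated as conservative for the purposes of this example only.
GROUPS = ["GAVLIMP", "FWY", "STNQC", "KRH", "DE"]
CLASS_OF = {aa: i for i, group in enumerate(GROUPS) for aa in group}


def translate(cds: str) -> str:
    """Translate a coding sequence (starting at the first codon) up to a stop."""
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def substitutions(reference: str, sample: str):
    """Return (position, ref_aa, sample_aa, is_conservative) for each mismatch.

    Assumes equal-length proteins, i.e. substitutions only, no indels.
    """
    if len(reference) != len(sample):
        raise ValueError("this sketch handles substitutions only")
    return [(pos, r, s, CLASS_OF[r] == CLASS_OF[s])
            for pos, (r, s) in enumerate(zip(reference, sample), start=1)
            if r != s]


reference_cds = "ATGGAGGCGGCGCGCACG"   # first six codons of the SEQ ID NO:1 CDS
sample_cds = "ATGGAGGCGGCGCGCATG"      # hypothetical variant: ACG -> ATG
for hit in substitutions(translate(reference_cds), translate(sample_cds)):
    print(hit)
```

Under the assumed grouping, the three changes listed in Table 1 (Thr to Ile, Ser to Phe, Ser to Pro) would each be flagged as non-conservative. A different grouping, or a substitution matrix, could classify them differently.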

[0021] We have also discovered a similar allelic difference associated with susceptibility to diabetes in a highly related gene. We have detected a C-to-A mutation in the SorCS3 gene that is found in a Bedouin Arab family in which type 2 diabetes has a high occurrence rate. The mutation results in a serine-to-arginine substitution at amino acid 790 of the human SorCS3 amino acid sequence (SEQ ID NO:4). While it is thus known that this mutation can permit the development of type 2 diabetes, there are certainly other mutations of these genes which cause the same susceptibility. The discovery of this mutation lends support to the concept that both of the related genes SorCS1 and SorCS3 can be the source of genetic susceptibility to type 2 diabetes.
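As a consistency check on the nucleotide and amino acid changes described above, one can enumerate which serine codons become arginine codons through a single C-to-A change. The sketch below assumes the standard genetic code and that the change is reported on the coding strand; the function name c_to_a_ser_to_arg is hypothetical, and the codon actually present at position 790 is not stated in this specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def c_to_a_ser_to_arg():
    """List (ser_codon, arg_codon, codon_position) for every single C->A
    change that converts a serine codon into an arginine codon."""
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa != "S":
            continue
        for i, base in enumerate(codon):
            if base != "C":
                continue
            mutant = codon[:i] + "A" + codon[i + 1:]
            if CODON_TABLE[mutant] == "R":
                hits.append((codon, mutant, i + 1))
    return hits


print(c_to_a_ser_to_arg())
```

Under these assumptions the only route is AGC to AGA, a change at the third codon position, which is at least consistent with a single C-to-A mutation producing a serine-to-arginine substitution.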

[0022] Susceptibility may also be determined by measuring the mRNA or protein level of SorCS1 or SorCS3. A lack of expression of the proper form of SorCS1, SorCS3, or both, at either the mRNA level or the protein level indicates susceptibility to developing type 2 diabetes. The expression level can be compared to the normal range of expression, and an expression level lower than the normal range indicates susceptibility to developing type 2 diabetes. The normal range of expression can be established by measuring the expression level in a suitable number of type 2 diabetes-free individuals. Given that the cDNA and amino acid sequences of both SorCS1 and SorCS3 are known, one of ordinary skill can readily design probes and primers and generate antibodies to practice the method described above.
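One simple way to operationalize a "normal range" is sketched below. It assumes that the range is taken as the mean plus or minus k sample standard deviations of expression levels measured in diabetes-free individuals, and that reduced expression is the finding of interest; neither the statistic nor the value k = 2 comes from this specification, and the function names and the measurement values are hypothetical.

```python
from statistics import mean, stdev


def normal_range(control_levels, k=2.0):
    """Return (low, high) as mean +/- k sample standard deviations of the
    expression levels measured in diabetes-free control individuals."""
    m = mean(control_levels)
    s = stdev(control_levels)
    return (m - k * s, m + k * s)


def below_normal(level, control_levels, k=2.0):
    """True if a subject's expression level falls below the control range."""
    low, _high = normal_range(control_levels, k)
    return level < low


# Hypothetical normalized expression levels from control individuals.
controls = [1.00, 0.92, 1.08, 0.97, 1.03, 1.05, 0.95]

print(normal_range(controls))
print(below_normal(0.60, controls))   # markedly reduced expression
print(below_normal(0.95, controls))   # within the control range
```

In practice the number of control individuals, the normalization of the assay, and the cut-off would all need to be chosen and validated for the particular probe, primer set, or antibody used.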

[0023] Diagnostic analysis of the SorCS1 or SorCS3 gene may also be valuable in the field of pharmacogenomics. Some therapeutic agents are only effective in patients having a selected variant of a certain gene. In this embodiment, a subject in need of treatment provides a DNA sample from which the DNA sequences of SorCS1 and SorCS3 are determined. The outcome determines which therapeutic agent is administered to the patient.

[0024] From the perspective of prevention and treatment of type 2 diabetes, natural or non-natural ligands of SorCS1 or SorCS3 that modulate (i.e. stimulate or antagonize) the activity of the proteins are potential prevention and therapeutic agents. SorCS1 and SorCS3 are cell surface receptors which presumably trigger a cellular process. If this process can be stimulated artificially, the effect of the disease might be ameliorated. For example, when an individual does not produce enough natural ligands for SorCS1 or SorCS3, the natural ligand or an artificial ligand can be administered into the individual to bind to and increase the function of the receptor. In addition, if the SorCS1 or SorCS3 pathway does not function, increasing the activity by administering a ligand may help compensate for the lost function. Neurotensin, which binds to sortilin on the VPS10 domain, is expected to bind to SorCS1 and potentially can be used as a preventive or therapeutic agent for type 2 diabetes.

[0025] Other natural or non-natural ligands of SorCS1 or SorCS3 can be identified. Given that the cDNA and amino acid sequences of SorCS1 and SorCS3 are known, one of ordinary skill can readily screen for agents that interact with SorCS1 or SorCS3. For example, one can use a cell culture system in which cells express SorCS1 or SorCS3. These cells can be exposed to a test agent and the presence or absence of an agent/SorCS1 or agent/SorCS3 complex is determined. It is well within the capability of one of ordinary skill in the art to make such a determination. An in vitro system in which a SorCS1 or SorCS3 protein can be exposed to a test agent directly can also be used to screen for ligands of SorCS1 or SorCS3. In the screening method described here, not only the human SorCS1 but also mouse SorCS1, or genes from other mammalian species, can be used. Fragments of these proteins that include the VPS10 domain or other domains can also be used. Mouse SorCS1 mRNA and amino acid sequences are available at GenBank Accession No. AF195056.

[0026] All types of assays for identifying ligands and modulators of SorCS1 or SorCS3 are contemplated by the inventors. Such assays include, but are not limited to, assays which measure SorCS1 or SorCS3 biological activity, assays which measure expression of SorCS1 or SorCS3 (preferably employing the promoter sequence of the genes encoding these proteins linked to a reporter gene) or “in silico” assays which use computational models of the protein to predict compounds which will modulate the protein's biological activity or expression. The assays are designed to identify ligands and modulators which are potential therapeutic agents, or analogs thereof, which have utility in the treatment of type 2 diabetes and related diseases.

[0027] mRNA or protein expression assays are also useful for identifying compounds which can modulate (i.e. up regulate or down regulate) expression of the gene, including compounds which modulate the activity of transcription regulators of SorCS1 or SorCS3. Such expression assays typically include an expression construct comprising the promoter region (5′UTR and associated genomic sequence) of the gene linked to a reporter gene. Potential therapeutic agents, or analogs thereof, are identified by their ability to modulate expression of the gene in question. Those skilled in the art are capable of identifying transcription factors which are responsible for regulating transcription of the gene in question.

[0028] Ligands and modulators identified for use as therapeutic (or prophylactic) agents can be of any composition. They are preferably orally available small molecule compounds. In an alternative embodiment, such compositions are selected from among small molecules, antisense molecules, siRNA, therapeutic antibodies and the like. In some embodiments a gene therapy vehicle (plasmid, viral or non-viral (lipid based) vector) may be used to deliver a copy of the SorCS1 gene to a cell for therapeutic expression of the respective proteins. Therapeutic compounds may be delivered orally, intravenously, by inhalation, and/or by any other of the means well known to those in the art.

[0029] The invention also includes a wide variety of tools for use in research which employ SorCS1 or SorCS3, such as but not limited to purified genes or proteins, recombinant cells containing additional copies of the gene(s), antibodies to the proteins (humanized, therapeutic or otherwise) and transgenic animals, such as mice created to have non-functional forms of the gene (knock-out or knock-down) or recombinant mice having additional copies of the gene(s).

[0030] EXAMPLE

[0031] As described in Stoehr, JP et al., Diabetes 49: 1946-1954 (2000), which is herein incorporated by reference in its entirety, when the t2dm2 locus on chromosome 19 of the C57BL/6 (B6) mouse strain was introduced into the BTBR mouse strain background to generate congenic mice, the non-diabetic BTBR mice became more severely diabetic. The inventors here generated a panel of interval specific congenic strains (ISC strains) for the t2dm2 locus on chromosome 19 of the B6 mouse in the BTBR background. The diabetic phenotype of the ISC strains was determined by measuring the fasting glucose levels. By comparing the overlapping t2dm2 locus fragments contained in different ISC strains and their phenotypes, the genomic region that contains the type 2 diabetes susceptibility gene(s) was narrowed down to a 7 Mb fragment. Through searching the sequence information available from Celera and the Public Genome Consortium, one gene of about 0.5 Mb in size, called SorCS3, was identified in the region. This gene is present in both mouse and the syntenic region of the human genome (chromosome 10). The full-length mRNA for this gene has been detected in both humans and mice. Close by SorCS3, in both human and mouse, a gene called SorCS1 that belongs to the same sortilin family as the SorCS3 gene was found. Both SorCS3 and SorCS1 were suspects for possible type 2 diabetes susceptibility genes found in this region.
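The logic of narrowing a region by comparing overlapping congenic intervals can be sketched as follows. This is a simplified illustration, not the inventors' procedure: it assumes a single causal locus, so that a strain shows the severe phenotype if and only if its B6-derived segment contains that locus. The function name narrow_region is hypothetical, and the strain intervals are invented; they were chosen only so that the result matches the interval between map units 48 and 55 mentioned in the description.

```python
def narrow_region(strains):
    """Narrow a candidate region from interval-specific congenic strains.

    strains: list of (start, end, is_severe) tuples describing the donor
    segment carried by each strain and its phenotype.
    Returns a list of (start, end) intervals that are carried by every
    severe strain and by no non-severe strain.
    """
    severe = [(s, e) for s, e, is_severe in strains if is_severe]
    if not severe:
        raise ValueError("at least one severe strain is required")
    # The locus must lie in the overlap of all severe donor segments.
    lo = max(s for s, _ in severe)
    hi = min(e for _, e in severe)
    if lo >= hi:
        return []
    candidates = [(lo, hi)]
    # Remove any part also carried by a strain without the severe phenotype.
    for s, e, is_severe in strains:
        if is_severe:
            continue
        remaining = []
        for c_lo, c_hi in candidates:
            if e <= c_lo or s >= c_hi:       # no overlap with this candidate
                remaining.append((c_lo, c_hi))
                continue
            if s > c_lo:
                remaining.append((c_lo, s))  # piece left of the excluded part
            if e < c_hi:
                remaining.append((e, c_hi))  # piece right of the excluded part
        candidates = remaining
    return candidates


# Invented donor segments (map units) and phenotypes for three ISC strains.
toy_strains = [(40, 55, True), (45, 60, True), (30, 48, False)]
print(narrow_region(toy_strains))
```

With real data, marker spacing limits how precisely each segment boundary is known, so the narrowed interval is bounded by the nearest flanking markers rather than by exact coordinates.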

[0032] Triglyceride levels in congenic mice of B6/7 Mb in the BTBR background were measured. The homozygotes, which were diabetic, were found to have a higher triglyceride level than the heterozygotes, which were non-diabetic.

[0033] Subsequent sequencing of the alleles of the SorCS1 and SorCS3 genes in the two strains led to the identification of SorCS1 as the responsible genetic difference. This conclusion was reached because the alleles of the SorCS3 gene in the two mouse strains were identical, whereas, by contrast, the alleles of the SorCS1 gene differed from each other by three nucleotides, as identified above in Table 1.

[0034] It appears that the activity of the SorCS1 protein may determine islet mass. Alternatively, the SorCS1 protein may affect insulin secretion in pancreatic beta cells or insulin degradation in the kidney or liver. Either of these will affect plasma insulin levels, which are altered in the congenic mice.

REFERENCES

[0035] 1. Kahn B B: Type 2 diabetes: when insulin secretion fails to compensate for insulin resistance. Cell 92:593-596, 1998

[0036] 2. Taylor S I: Deconstructing type 2 diabetes. Cell 97:9-12, 1999

[0037] 3. Polonsky K S, Sturis J, Bell G I: Non-insulin-dependent diabetes mellitus: a genetically programmed failure of the beta cell to compensate for insulin resistance. N Engl J Med 334:777-783, 1996

[0038] 4. Polonsky K S: The beta-cell in diabetes: from molecular genetics to clinical research. Diabetes 44:705-717, 1995

[0039] 5. Kahn C R, Vicent D, Doria A: Genetics of non-insulin-dependent (type-II) diabetes mellitus. Annu Rev Med 47:509-531, 1996

1 4 1 5757 DNA Human CDS (228)..(3893) 1 atcaccctct ggacaagaga acgggcgagc gggagctagg agggaagagt ggagaggacc 60 ggcgaggcgc gccagccgga gccacctcct tcccggccgc cccctcccca ctccccctac 120 acacacacgc tcgctcgctc gccggcgcgc gcacaccccc cgcgccggac ccgcacctcg 180 gcgggcgcca cacactcggc agcccgagcc gcggtagccg cagcggg atg gag gcg 236 Met Glu Ala 1 gcg cgc acg gag cgc ccc gca ggc agg ccg ggg gcg ccg ctt gtc cgg 284 Ala Arg Thr Glu Arg Pro Ala Gly Arg Pro Gly Ala Pro Leu Val Arg 5 10 15 acg ggg ctc cta ctc ttg tcg acg tgg gtc ctg gcc ggc gcc gag atc 332 Thr Gly Leu Leu Leu Leu Ser Thr Trp Val Leu Ala Gly Ala Glu Ile 20 25 30 35 act tgg gac gcg aca ggc ggt ccc gga cgc ccg gcg gcc ccg gct tcg 380 Thr Trp Asp Ala Thr Gly Gly Pro Gly Arg Pro Ala Ala Pro Ala Ser 40 45 50 cgg cca ccg gcg ttg tct cca ctc tcg ccg cgg gca gtg gcc agc cag 428 Arg Pro Pro Ala Leu Ser Pro Leu Ser Pro Arg Ala Val Ala Ser Gln 55 60 65 tgg ccg gag gag ctg gcg tcg gcg cgg aga gcc gcc gtg ctg ggg cgc 476 Trp Pro Glu Glu Leu Ala Ser Ala Arg Arg Ala Ala Val Leu Gly Arg 70 75 80 cgg gcc gga cca gag ctg ctg ccc cag cag ggc ggc ggc aga ggc ggt 524 Arg Ala Gly Pro Glu Leu Leu Pro Gln Gln Gly Gly Gly Arg Gly Gly 85 90 95 gag atg cag gtg gaa gcc gga ggg aca tca ccg gca ggc gag cgg cgg 572 Glu Met Gln Val Glu Ala Gly Gly Thr Ser Pro Ala Gly Glu Arg Arg 100 105 110 115 ggc cgg ggc atc cca gct cct gcc aag ctt ggc ggc gcg agg agg agt 620 Gly Arg Gly Ile Pro Ala Pro Ala Lys Leu Gly Gly Ala Arg Arg Ser 120 125 130 cgc cgg gcg cag ccc cca atc acc cag gaa cgc ggg gac gcc tgg gcc 668 Arg Arg Ala Gln Pro Pro Ile Thr Gln Glu Arg Gly Asp Ala Trp Ala 135 140 145 act gct ccg gcc gat ggt tcc aga gga agc cgt ccc ctt gct aag ggt 716 Thr Ala Pro Ala Asp Gly Ser Arg Gly Ser Arg Pro Leu Ala Lys Gly 150 155 160 tcc cgg gag gag gtg aag gcg ccg cgg gct ggg ggg tcg gcg gct gaa 764 Ser Arg Glu Glu Val Lys Ala Pro Arg Ala Gly Gly Ser Ala Ala Glu 165 170 175 gac ctc cgg ctg ccc agc acc tcc ttc gcg ctg acc ggg gac tcg gcc 812 Asp Leu Arg Leu Pro Ser Thr 
Ser Phe Ala Leu Thr Gly Asp Ser Ala 180 185 190 195 cac aac caa gcc atg gtg cac tgg tcg gga cac aac agc agc gtc ata 860 His Asn Gln Ala Met Val His Trp Ser Gly His Asn Ser Ser Val Ile 200 205 210 ctt atc ctg acg aag ctg tat gac ttc aac ctg ggc agc gtg act gag 908 Leu Ile Leu Thr Lys Leu Tyr Asp Phe Asn Leu Gly Ser Val Thr Glu 215 220 225 agt tca cta tgg agg tcg aca gat tat ggc acc acc tat gaa aag ctg 956 Ser Ser Leu Trp Arg Ser Thr Asp Tyr Gly Thr Thr Tyr Glu Lys Leu 230 235 240 aat gac aaa gtg ggt ttg aag act gtc ctc agt tac ctc tat gtc aat 1004 Asn Asp Lys Val Gly Leu Lys Thr Val Leu Ser Tyr Leu Tyr Val Asn 245 250 255 cca acc aac aaa agg aag att atg ctt ctc agt gat cct gag atg gag 1052 Pro Thr Asn Lys Arg Lys Ile Met Leu Leu Ser Asp Pro Glu Met Glu 260 265 270 275 agc agc ata ttg atc agc tca gac gaa ggg gcg acc tat cag aag tat 1100 Ser Ser Ile Leu Ile Ser Ser Asp Glu Gly Ala Thr Tyr Gln Lys Tyr 280 285 290 cgg ctc acc ttc tat atc cag agc ctg ctc ttt cat ccc aag caa gag 1148 Arg Leu Thr Phe Tyr Ile Gln Ser Leu Leu Phe His Pro Lys Gln Glu 295 300 305 gac tgg gtg ctg gcc tac agt ttg gat caa aag ctc tac agc tcc atg 1196 Asp Trp Val Leu Ala Tyr Ser Leu Asp Gln Lys Leu Tyr Ser Ser Met 310 315 320 gac ttt gga aga cgg tgg caa ctc atg cat gaa cgc atc aca ccc aac 1244 Asp Phe Gly Arg Arg Trp Gln Leu Met His Glu Arg Ile Thr Pro Asn 325 330 335 agg ttt tat tgg tcg gtg gcc gga ttg gat aag gag gcg gac ctg gtg 1292 Arg Phe Tyr Trp Ser Val Ala Gly Leu Asp Lys Glu Ala Asp Leu Val 340 345 350 355 cac atg gag gtg cgg acc acg gat gga tat gct cac tac ctc acc tgc 1340 His Met Glu Val Arg Thr Thr Asp Gly Tyr Ala His Tyr Leu Thr Cys 360 365 370 agg atc cag gaa tgt gcc gag aca act aga agt ggg cct ttt gcc cgc 1388 Arg Ile Gln Glu Cys Ala Glu Thr Thr Arg Ser Gly Pro Phe Ala Arg 375 380 385 tcc att gac atc agt tcc ctg gtt gtc cag gat gaa tat atc ttc att 1436 Ser Ile Asp Ile Ser Ser Leu Val Val Gln Asp Glu Tyr Ile Phe Ile 390 395 400 cag gta aca act agt gga aga gcc agc tac tac gtg tct tat 
cga aga 1484 Gln Val Thr Thr Ser Gly Arg Ala Ser Tyr Tyr Val Ser Tyr Arg Arg 405 410 415 gag gcc ttt gct cag ata aag ctg cct aag tac tcg ttg cca aag gac 1532 Glu Ala Phe Ala Gln Ile Lys Leu Pro Lys Tyr Ser Leu Pro Lys Asp 420 425 430 435 atg cac atc atc agt aca gac gag aac caa gta ttt gct gcg gtc caa 1580 Met His Ile Ile Ser Thr Asp Glu Asn Gln Val Phe Ala Ala Val Gln 440 445 450 gaa tgg aac cag aat gac acg tac aac ctc tac atc tca gac acg cgt 1628 Glu Trp Asn Gln Asn Asp Thr Tyr Asn Leu Tyr Ile Ser Asp Thr Arg 455 460 465 ggg att tac ttc act ctg gcc atg gag aac atc aag agc agc aga ggt 1676 Gly Ile Tyr Phe Thr Leu Ala Met Glu Asn Ile Lys Ser Ser Arg Gly 470 475 480 cta atg ggg aac atc att att gaa ttg tat gag gta gca ggt atc aaa 1724 Leu Met Gly Asn Ile Ile Ile Glu Leu Tyr Glu Val Ala Gly Ile Lys 485 490 495 ggg ata ttt ctg gca aac aag aag gtg gac gac cag gtg aag aca tac 1772 Gly Ile Phe Leu Ala Asn Lys Lys Val Asp Asp Gln Val Lys Thr Tyr 500 505 510 515 atc act tac aac aaa ggc agg gat tgg cgc ctg ctg caa gct ccg gat 1820 Ile Thr Tyr Asn Lys Gly Arg Asp Trp Arg Leu Leu Gln Ala Pro Asp 520 525 530 gtg gac ctg aga gga agc cca gtg cac tgc ctg ctg ccc ttc tgt tcc 1868 Val Asp Leu Arg Gly Ser Pro Val His Cys Leu Leu Pro Phe Cys Ser 535 540 545 tta cat ctg cac ctg caa ctc tct gaa aat cca tat tcc tca gga aga 1916 Leu His Leu His Leu Gln Leu Ser Glu Asn Pro Tyr Ser Ser Gly Arg 550 555 560 atc tct agc aag gag aca gcc cca gga ctt gtg gtg gct aca ggc aac 1964 Ile Ser Ser Lys Glu Thr Ala Pro Gly Leu Val Val Ala Thr Gly Asn 565 570 575 att ggc ccg gag ctc tca tat act gat att ggt gtg ttc atc tcc tcc 2012 Ile Gly Pro Glu Leu Ser Tyr Thr Asp Ile Gly Val Phe Ile Ser Ser 580 585 590 595 gat ggg ggc aac aca tgg aga cag atc ttt gat gaa gag tac aat gtc 2060 Asp Gly Gly Asn Thr Trp Arg Gln Ile Phe Asp Glu Glu Tyr Asn Val 600 605 610 tgg ttc cta gac tgg ggt ggt gcc ctc gtg gcc atg aaa cac aca cct 2108 Trp Phe Leu Asp Trp Gly Gly Ala Leu Val Ala Met Lys His Thr Pro 615 620 625 ctg cca gtc 
agg cat ttg tgg gtg agt ttt gat gag ggc cac tct tgg 2156 Leu Pro Val Arg His Leu Trp Val Ser Phe Asp Glu Gly His Ser Trp 630 635 640 gac aag tat ggt ttc act tcg gtt cct ctc ttt gtt gac ggg gct ctg 2204 Asp Lys Tyr Gly Phe Thr Ser Val Pro Leu Phe Val Asp Gly Ala Leu 645 650 655 gtg gag gca gga atg gag acc cac atc atg aca gtt ttt ggc cac ttc 2252 Val Glu Ala Gly Met Glu Thr His Ile Met Thr Val Phe Gly His Phe 660 665 670 675 agc ctc cgc tcc gaa tgg caa ttg gtg aaa gtg gac tac aaa tct atc 2300 Ser Leu Arg Ser Glu Trp Gln Leu Val Lys Val Asp Tyr Lys Ser Ile 680 685 690 ttc agc cgg cat tgc acc aag gag gac tat cag acc tgg cac ctg ctc 2348 Phe Ser Arg His Cys Thr Lys Glu Asp Tyr Gln Thr Trp His Leu Leu 695 700 705 aat cag gga gag cct tgt gtc atg gga gaa agg aaa ata ttc aag aaa 2396 Asn Gln Gly Glu Pro Cys Val Met Gly Glu Arg Lys Ile Phe Lys Lys 710 715 720 cgt aag cca gga gct cag tgt gcc ctg ggc cga gac cac tca gga tca 2444 Arg Lys Pro Gly Ala Gln Cys Ala Leu Gly Arg Asp His Ser Gly Ser 725 730 735 gtg gtc tca gaa ccc tgt gtc tgt gcc aat tgg gac ttc gag tgt gac 2492 Val Val Ser Glu Pro Cys Val Cys Ala Asn Trp Asp Phe Glu Cys Asp 740 745 750 755 tat ggg tat gag aga cat ggg gag agc cag tgt gtc cca gct ttc tgg 2540 Tyr Gly Tyr Glu Arg His Gly Glu Ser Gln Cys Val Pro Ala Phe Trp 760 765 770 tac aat cca gca tcc cca tca aag gac tgc agc ctt ggt caa agc tac 2588 Tyr Asn Pro Ala Ser Pro Ser Lys Asp Cys Ser Leu Gly Gln Ser Tyr 775 780 785 ctt aac agc act ggg tat cgg cgg att gtg tcc aac aac tgc aca gat 2636 Leu Asn Ser Thr Gly Tyr Arg Arg Ile Val Ser Asn Asn Cys Thr Asp 790 795 800 ggg cta agg gag aag tac acc gcc aag gcc cag atg tgc cct gga aaa 2684 Gly Leu Arg Glu Lys Tyr Thr Ala Lys Ala Gln Met Cys Pro Gly Lys 805 810 815 gcc cct cgg ggc ctc cat gtg gtg acg acc gat ggg cgg ctg gtg gca 2732 Ala Pro Arg Gly Leu His Val Val Thr Thr Asp Gly Arg Leu Val Ala 820 825 830 835 gag cag ggg cac aat gca act ttc atc atc ctc atg gag gag ggt gat 2780 Glu Gln Gly His Asn Ala Thr Phe Ile Ile Leu 
Met Glu Glu Gly Asp 840 845 850 cta caa agg aca aac atc cag ctt gac ttt ggg gat ggg att gct gtg 2828 Leu Gln Arg Thr Asn Ile Gln Leu Asp Phe Gly Asp Gly Ile Ala Val 855 860 865 tcc tac gca aac ttc agc ccc atc gag gac ggc atc aag cac gtg tat 2876 Ser Tyr Ala Asn Phe Ser Pro Ile Glu Asp Gly Ile Lys His Val Tyr 870 875 880 aag agt gcg ggg atc ttc cag gtg aca gcc tat gca gag aac aac ctt 2924 Lys Ser Ala Gly Ile Phe Gln Val Thr Ala Tyr Ala Glu Asn Asn Leu 885 890 895 ggc tca gac aca gct gtc ctc ttc ctg cat gtg gtt tgt cct gtg gag 2972 Gly Ser Asp Thr Ala Val Leu Phe Leu His Val Val Cys Pro Val Glu 900 905 910 915 cat gtt cat ctc cga gtt cca ttt gtt gcc ata aga aat aag gag gtc 3020 His Val His Leu Arg Val Pro Phe Val Ala Ile Arg Asn Lys Glu Val 920 925 930 aac atc agt gca gtc gtg tgg ccc agt caa ctg ggg acc ctt acc tat 3068 Asn Ile Ser Ala Val Val Trp Pro Ser Gln Leu Gly Thr Leu Thr Tyr 935 940 945 ttc tgg tgg ttc ggc aat agc aca aag cct ctc atc act ttg gac agc 3116 Phe Trp Trp Phe Gly Asn Ser Thr Lys Pro Leu Ile Thr Leu Asp Ser 950 955 960 agc att tcc ttc aca ttc ctt gca gaa gga acc gac acc atc aca gtc 3164 Ser Ile Ser Phe Thr Phe Leu Ala Glu Gly Thr Asp Thr Ile Thr Val 965 970 975 cag gtg gct gct ggg aat gcc ctc atc cag gac aca aaa gag att gca 3212 Gln Val Ala Ala Gly Asn Ala Leu Ile Gln Asp Thr Lys Glu Ile Ala 980 985 990 995 gtt cat gaa tat ttc cag tcc cag ctt tta tca ttc tct cct aat ctg 3260 Val His Glu Tyr Phe Gln Ser Gln Leu Leu Ser Phe Ser Pro Asn Leu 1000 1005 1010 gat tac cac aat cct gac att cct gag tgg aga aaa gat att ggc aat 3308 Asp Tyr His Asn Pro Asp Ile Pro Glu Trp Arg Lys Asp Ile Gly Asn 1015 1020 1025 gtc atc aag cga gct ctg gtt aaa gta acc agt gtc cca gag gac cag 3356 Val Ile Lys Arg Ala Leu Val Lys Val Thr Ser Val Pro Glu Asp Gln 1030 1035 1040 atc ctc att gcc gtg ttt cct ggt ctc ccc act tca gca gag ctt ttc 3404 Ile Leu Ile Ala Val Phe Pro Gly Leu Pro Thr Ser Ala Glu Leu Phe 1045 1050 1055 att ctt cca ccc aag aac ctg aca gag agg agg aaa ggc aat gaa 
ggg 3452 Ile Leu Pro Pro Lys Asn Leu Thr Glu Arg Arg Lys Gly Asn Glu Gly 1060 1065 1070 1075 gac ctg gaa caa att gta gaa aca ctg ttt aat gct ctc aac caa aat 3500 Asp Leu Glu Gln Ile Val Glu Thr Leu Phe Asn Ala Leu Asn Gln Asn 1080 1085 1090 ttg gtc cag ttt gag ctg aag ccg ggg gta caa gtc att gtg tat gtc 3548 Leu Val Gln Phe Glu Leu Lys Pro Gly Val Gln Val Ile Val Tyr Val 1095 1100 1105 aca cag ctg acg tta gct cca ttg gtg gac tcc agt gct ggg cac agc 3596 Thr Gln Leu Thr Leu Ala Pro Leu Val Asp Ser Ser Ala Gly His Ser 1110 1115 1120 agc tca gcc atg ctt atg cta tta tca gtg gta ttt gtt ggc ctg gct 3644 Ser Ser Ala Met Leu Met Leu Leu Ser Val Val Phe Val Gly Leu Ala 1125 1130 1135 gtg ttt ttg atc tac aag ttt aaa agg aaa atc cct tgg att aac atc 3692 Val Phe Leu Ile Tyr Lys Phe Lys Arg Lys Ile Pro Trp Ile Asn Ile 1140 1145 1150 1155 tat gct caa gtc caa cac gac aag gag cag gag atg att ggg tca gtg 3740 Tyr Ala Gln Val Gln His Asp Lys Glu Gln Glu Met Ile Gly Ser Val 1160 1165 1170 agc caa agt gaa aac gcc ccc aaa atc aca ctc agt gac ttt acg gag 3788 Ser Gln Ser Glu Asn Ala Pro Lys Ile Thr Leu Ser Asp Phe Thr Glu 1175 1180 1185 cct gag gag ctg ctg gac aaa gag ctg gac acg cgg gtc ata gga ggc 3836 Pro Glu Glu Leu Leu Asp Lys Glu Leu Asp Thr Arg Val Ile Gly Gly 1190 1195 1200 att gcc act att gca aac agc gaa agc aca aag gag atc ccc aac tgc 3884 Ile Ala Thr Ile Ala Asn Ser Glu Ser Thr Lys Glu Ile Pro Asn Cys 1205 1210 1215 act agt gtt taataccagc aagccacgtg gtcaaccacc tttctgactt 3933 Thr Ser Val 1220 tttatttttg atgattacta ttactattat tatggaaaaa ttaaaatgtc ttttttacct 3993 tttgtttacc aagggcccct tcataaatag caggcaaatg cctagctttg ggagaaaagg 4053 gcattcttag ctgattgaaa tgagacaaag ggaataaatg gctgtatttg tgctaagagc 4113 aaaggatgca tcttcccaca gcctcctcgc tttactctgc cattggtagc ttaaagactt 4173 tctttttcct tgtggtctcc cttttttcaa aattgaagtt gggttggctc tttgtgaacc 4233 tctcatcccc acagcagaat caccaacact ctccgcttcc cccagcacac acacatacaa 4293 cacagatcat ttcccagtta gatccgcagg aagtaggttg gtgggggtgg atgtagctgc 
4353 agaaagcatg cacaactttg tgaaagaggc cctgccttgt gcatgtccat agtgaggcta 4413 cagatggctt attgtatata attacaatgt aaatagcttt ttatttccta agaaataatt 4473 taatgtttag taaaaaagaa aacagaaaaa agaaagatgc gtgtgttggc ttacgcactg 4533 gccctcagag ctgaccaacc cgccaggcct gctcaatgca ttgggtttgg atgctctcct 4593 gttgtctgtc acacttaact cttgcatctc cttgtccatg ccatagctgg tttctactta 4653 tgtatataaa ggggggtggg gggaggggct tctctggggc aattgataaa ggaaggactc 4713 tagtgacatc atagaacatg gcagtcgttt ttgttccaag aatgatatga aaggtgaaga 4773 agaggcccac tagaggcttc atactgagac ccagatgggg gaaaacagct tcctctctaa 4833 aaggaaaaac ttgatattta tcagtctgag aaaatatttt tttctaaaga aggcagtcag 4893 tggatcttaa aatgacaatc tgtttttaaa ttggattcta tgaaaatgca taatgcttat 4953 ggtgaattct caggctattc tgagctcaga aaagtcccct gggcactagg taaagcccag 5013 tgaatgtctc ttggcatggg aggagttaaa gaggttggaa gggaagaggc atttgtggaa 5073 ttatgagttc atgcaaaact ctccaggcca agtaggggtc tagcctttaa tgatattagt 5133 caaaggcaat tttagcaaag ctgtgctatt tgcttgtcag atgtacacaa cttccttaaa 5193 gtcaaatgtc tgccttcagt tcccttaagg tagttcttgc ctctggggtg agtggctttc 5253 aaagcctttt agcttttcca gcacctcagc cccttcacac atttacacat accaattttt 5313 ttcaataggg tcacgttaag ccatgctgta agcattgttt ttattttcag gcttagcctg 5373 agcacactta tttttgaaaa tgatataatg tatatatatg ggaggaaagg ccacattttg 5433 tacctgttaa tttttgtggg atgttgttcc cattcttctt tgtgagacag agagaatgtg 5493 atatagagaa atctggctgg ctacagtgta gatcagtatt aggaatattt ctaaagatcc 5553 tgcttttttg tttcaagggt taaatggggc agacaattgc aatacttgta ctaaacactg 5613 gaatacaaat gcatgactca tatctatata tacagtatat gtacatatac tgttcttggt 5673 tttattgttc cacttgaata tttctactgt aaaaaaaaga cagtggtttt gaaattgttg 5733 aaaataaatg tatttttgta catc 5757 2 1222 PRT Human 2 Met Glu Ala Ala Arg Thr Glu Arg Pro Ala Gly Arg Pro Gly Ala Pro 1 5 10 15 Leu Val Arg Thr Gly Leu Leu Leu Leu Ser Thr Trp Val Leu Ala Gly 20 25 30 Ala Glu Ile Thr Trp Asp Ala Thr Gly Gly Pro Gly Arg Pro Ala Ala 35 40 45 Pro Ala Ser Arg Pro Pro Ala Leu Ser Pro Leu Ser Pro Arg Ala Val 50 55 60 Ala Ser Gln Trp 
Pro Glu Glu Leu Ala Ser Ala Arg Arg Ala Ala Val 65 70 75 80 Leu Gly Arg Arg Ala Gly Pro Glu Leu Leu Pro Gln Gln Gly Gly Gly 85 90 95 Arg Gly Gly Glu Met Gln Val Glu Ala Gly Gly Thr Ser Pro Ala Gly 100 105 110 Glu Arg Arg Gly Arg Gly Ile Pro Ala Pro Ala Lys Leu Gly Gly Ala 115 120 125 Arg Arg Ser Arg Arg Ala Gln Pro Pro Ile Thr Gln Glu Arg Gly Asp 130 135 140 Ala Trp Ala Thr Ala Pro Ala Asp Gly Ser Arg Gly Ser Arg Pro Leu 145 150 155 160 Ala Lys Gly Ser Arg Glu Glu Val Lys Ala Pro Arg Ala Gly Gly Ser 165 170 175 Ala Ala Glu Asp Leu Arg Leu Pro Ser Thr Ser Phe Ala Leu Thr Gly 180 185 190 Asp Ser Ala His Asn Gln Ala Met Val His Trp Ser Gly His Asn Ser 195 200 205 Ser Val Ile Leu Ile Leu Thr Lys Leu Tyr Asp Phe Asn Leu Gly Ser 210 215 220 Val Thr Glu Ser Ser Leu Trp Arg Ser Thr Asp Tyr Gly Thr Thr Tyr 225 230 235 240 Glu Lys Leu Asn Asp Lys Val Gly Leu Lys Thr Val Leu Ser Tyr Leu 245 250 255 Tyr Val Asn Pro Thr Asn Lys Arg Lys Ile Met Leu Leu Ser Asp Pro 260 265 270 Glu Met Glu Ser Ser Ile Leu Ile Ser Ser Asp Glu Gly Ala Thr Tyr 275 280 285 Gln Lys Tyr Arg Leu Thr Phe Tyr Ile Gln Ser Leu Leu Phe His Pro 290 295 300 Lys Gln Glu Asp Trp Val Leu Ala Tyr Ser Leu Asp Gln Lys Leu Tyr 305 310 315 320 Ser Ser Met Asp Phe Gly Arg Arg Trp Gln Leu Met His Glu Arg Ile 325 330 335 Thr Pro Asn Arg Phe Tyr Trp Ser Val Ala Gly Leu Asp Lys Glu Ala 340 345 350 Asp Leu Val His Met Glu Val Arg Thr Thr Asp Gly Tyr Ala His Tyr 355 360 365 Leu Thr Cys Arg Ile Gln Glu Cys Ala Glu Thr Thr Arg Ser Gly Pro 370 375 380 Phe Ala Arg Ser Ile Asp Ile Ser Ser Leu Val Val Gln Asp Glu Tyr 385 390 395 400 Ile Phe Ile Gln Val Thr Thr Ser Gly Arg Ala Ser Tyr Tyr Val Ser 405 410 415 Tyr Arg Arg Glu Ala Phe Ala Gln Ile Lys Leu Pro Lys Tyr Ser Leu 420 425 430 Pro Lys Asp Met His Ile Ile Ser Thr Asp Glu Asn Gln Val Phe Ala 435 440 445 Ala Val Gln Glu Trp Asn Gln Asn Asp Thr Tyr Asn Leu Tyr Ile Ser 450 455 460 Asp Thr Arg Gly Ile Tyr Phe Thr Leu Ala Met Glu Asn Ile Lys Ser 465 470 475 480 Ser Arg Gly Leu Met 
Gly Asn Ile Ile Ile Glu Leu Tyr Glu Val Ala 485 490 495 Gly Ile Lys Gly Ile Phe Leu Ala Asn Lys Lys Val Asp Asp Gln Val 500 505 510 Lys Thr Tyr Ile Thr Tyr Asn Lys Gly Arg Asp Trp Arg Leu Leu Gln 515 520 525 Ala Pro Asp Val Asp Leu Arg Gly Ser Pro Val His Cys Leu Leu Pro 530 535 540 Phe Cys Ser Leu His Leu His Leu Gln Leu Ser Glu Asn Pro Tyr Ser 545 550 555 560 Ser Gly Arg Ile Ser Ser Lys Glu Thr Ala Pro Gly Leu Val Val Ala 565 570 575 Thr Gly Asn Ile Gly Pro Glu Leu Ser Tyr Thr Asp Ile Gly Val Phe 580 585 590 Ile Ser Ser Asp Gly Gly Asn Thr Trp Arg Gln Ile Phe Asp Glu Glu 595 600 605 Tyr Asn Val Trp Phe Leu Asp Trp Gly Gly Ala Leu Val Ala Met Lys 610 615 620 His Thr Pro Leu Pro Val Arg His Leu Trp Val Ser Phe Asp Glu Gly 625 630 635 640 His Ser Trp Asp Lys Tyr Gly Phe Thr Ser Val Pro Leu Phe Val Asp 645 650 655 Gly Ala Leu Val Glu Ala Gly Met Glu Thr His Ile Met Thr Val Phe 660 665 670 Gly His Phe Ser Leu Arg Ser Glu Trp Gln Leu Val Lys Val Asp Tyr 675 680 685 Lys Ser Ile Phe Ser Arg His Cys Thr Lys Glu Asp Tyr Gln Thr Trp 690 695 700 His Leu Leu Asn Gln Gly Glu Pro Cys Val Met Gly Glu Arg Lys Ile 705 710 715 720 Phe Lys Lys Arg Lys Pro Gly Ala Gln Cys Ala Leu Gly Arg Asp His 725 730 735 Ser Gly Ser Val Val Ser Glu Pro Cys Val Cys Ala Asn Trp Asp Phe 740 745 750 Glu Cys Asp Tyr Gly Tyr Glu Arg His Gly Glu Ser Gln Cys Val Pro 755 760 765 Ala Phe Trp Tyr Asn Pro Ala Ser Pro Ser Lys Asp Cys Ser Leu Gly 770 775 780 Gln Ser Tyr Leu Asn Ser Thr Gly Tyr Arg Arg Ile Val Ser Asn Asn 785 790 795 800 Cys Thr Asp Gly Leu Arg Glu Lys Tyr Thr Ala Lys Ala Gln Met Cys 805 810 815 Pro Gly Lys Ala Pro Arg Gly Leu His Val Val Thr Thr Asp Gly Arg 820 825 830 Leu Val Ala Glu Gln Gly His Asn Ala Thr Phe Ile Ile Leu Met Glu 835 840 845 Glu Gly Asp Leu Gln Arg Thr Asn Ile Gln Leu Asp Phe Gly Asp Gly 850 855 860 Ile Ala Val Ser Tyr Ala Asn Phe Ser Pro Ile Glu Asp Gly Ile Lys 865 870 875 880 His Val Tyr Lys Ser Ala Gly Ile Phe Gln Val Thr Ala Tyr Ala Glu 885 890 895 Asn Asn Leu Gly Ser Asp 
Thr Ala Val Leu Phe Leu His Val Val Cys 900 905 910 Pro Val Glu His Val His Leu Arg Val Pro Phe Val Ala Ile Arg Asn 915 920 925 Lys Glu Val Asn Ile Ser Ala Val Val Trp Pro Ser Gln Leu Gly Thr 930 935 940 Leu Thr Tyr Phe Trp Trp Phe Gly Asn Ser Thr Lys Pro Leu Ile Thr 945 950 955 960 Leu Asp Ser Ser Ile Ser Phe Thr Phe Leu Ala Glu Gly Thr Asp Thr 965 970 975 Ile Thr Val Gln Val Ala Ala Gly Asn Ala Leu Ile Gln Asp Thr Lys 980 985 990 Glu Ile Ala Val His Glu Tyr Phe Gln Ser Gln Leu Leu Ser Phe Ser 995 1000 1005 Pro Asn Leu Asp Tyr His Asn Pro Asp Ile Pro Glu Trp Arg Lys Asp 1010 1015 1020 Ile Gly Asn Val Ile Lys Arg Ala Leu Val Lys Val Thr Ser Val Pro 1025 1030 1035 1040 Glu Asp Gln Ile Leu Ile Ala Val Phe Pro Gly Leu Pro Thr Ser Ala 1045 1050 1055 Glu Leu Phe Ile Leu Pro Pro Lys Asn Leu Thr Glu Arg Arg Lys Gly 1060 1065 1070 Asn Glu Gly Asp Leu Glu Gln Ile Val Glu Thr Leu Phe Asn Ala Leu 1075 1080 1085 Asn Gln Asn Leu Val Gln Phe Glu Leu Lys Pro Gly Val Gln Val Ile 1090 1095 1100 Val Tyr Val Thr Gln Leu Thr Leu Ala Pro Leu Val Asp Ser Ser Ala 1105 1110 1115 1120 Gly His Ser Ser Ser Ala Met Leu Met Leu Leu Ser Val Val Phe Val 1125 1130 1135 Gly Leu Ala Val Phe Leu Ile Tyr Lys Phe Lys Arg Lys Ile Pro Trp 1140 1145 1150 Ile Asn Ile Tyr Ala Gln Val Gln His Asp Lys Glu Gln Glu Met Ile 1155 1160 1165 Gly Ser Val Ser Gln Ser Glu Asn Ala Pro Lys Ile Thr Leu Ser Asp 1170 1175 1180 Phe Thr Glu Pro Glu Glu Leu Leu Asp Lys Glu Leu Asp Thr Arg Val 1185 1190 1195 1200 Ile Gly Gly Ile Ala Thr Ile Ala Asn Ser Glu Ser Thr Lys Glu Ile 1205 1210 1215 Pro Asn Cys Thr Ser Val 1220 3 7272 DNA Human CDS (9)..(3512) 3 ctcccgcg atg gga aaa gtt ggc gcc ggc ggc ggc tcc caa gcc cgg ctg 50 Met Gly Lys Val Gly Ala Gly Gly Gly Ser Gln Ala Arg Leu 1 5 10 agc gcg ctc ctc gcc ggc gcg ggg ctc ttg atc ctc tgc gcc ccg ggc 98 Ser Ala Leu Leu Ala Gly Ala Gly Leu Leu Ile Leu Cys Ala Pro Gly 15 20 25 30 gtc tgc ggc ggc ggc tcc tgc tgc ccc tcg ccg cac ccc agc tcc gct 146 Val Cys Gly Gly Gly Ser Cys Cys Pro 
Ser Pro His Pro Ser Ser Ala 35 40 45 cca cgc tcg gcc tcg acc cct agg ggc ttt tcc cac cag ggg cgg cca 194 Pro Arg Ser Ala Ser Thr Pro Arg Gly Phe Ser His Gln Gly Arg Pro 50 55 60 ggc agg gct cct gcc acg ccc ctg ccc ctc gta gtg cgt ccc ctg ttc 242 Gly Arg Ala Pro Ala Thr Pro Leu Pro Leu Val Val Arg Pro Leu Phe 65 70 75 tca gtg gcc ccc ggg gac cga gcg cta tcc ctg gag cgg gct cgg ggc 290 Ser Val Ala Pro Gly Asp Arg Ala Leu Ser Leu Glu Arg Ala Arg Gly 80 85 90 act ggg gca tcc atg gcg gtt gct gca cgc tcc ggc cgg agg aga cgg 338 Thr Gly Ala Ser Met Ala Val Ala Ala Arg Ser Gly Arg Arg Arg Arg 95 100 105 110 agc gga gcg gat cag gag aag gca gaa cgg gga gag ggc gcg agt cgg 386 Ser Gly Ala Asp Gln Glu Lys Ala Glu Arg Gly Glu Gly Ala Ser Arg 115 120 125 agc ccc cgg gga gtg cta aga gat gga ggg cag cag gag cct ggg act 434 Ser Pro Arg Gly Val Leu Arg Asp Gly Gly Gln Gln Glu Pro Gly Thr 130 135 140 cgg gag cgg gac ccg gac aaa gcc acc cgc ttc cgg atg gag gag ctg 482 Arg Glu Arg Asp Pro Asp Lys Ala Thr Arg Phe Arg Met Glu Glu Leu 145 150 155 aga ctg acc agc acc acg ttt gcg ctg acg gga gac tca gca cac aac 530 Arg Leu Thr Ser Thr Thr Phe Ala Leu Thr Gly Asp Ser Ala His Asn 160 165 170 caa gcc atg gtc cac tgg tct ggc cac aac agc agc gtg att ctc att 578 Gln Ala Met Val His Trp Ser Gly His Asn Ser Ser Val Ile Leu Ile 175 180 185 190 ttg aca aag ctc tat gac tat aac ctg ggg agc atc aca gag agc tcg 626 Leu Thr Lys Leu Tyr Asp Tyr Asn Leu Gly Ser Ile Thr Glu Ser Ser 195 200 205 ctt tgg agg tca acc gat tat gga aca acc tat gag aag ctg aat gat 674 Leu Trp Arg Ser Thr Asp Tyr Gly Thr Thr Tyr Glu Lys Leu Asn Asp 210 215 220 aaa gtt ggt ttg aag acc att ttg ggc tat ctc tat gtg tgt cct acc 722 Lys Val Gly Leu Lys Thr Ile Leu Gly Tyr Leu Tyr Val Cys Pro Thr 225 230 235 aac aag cgt aag ata atg tta ctc aca gac ccg gag att gag agc agt 770 Asn Lys Arg Lys Ile Met Leu Leu Thr Asp Pro Glu Ile Glu Ser Ser 240 245 250 tta ttg atc agc tca gat gaa ggg gca act tat caa aag tac cgg ctg 818 Leu Leu Ile Ser Ser Asp 
Glu Gly Ala Thr Tyr Gln Lys Tyr Arg Leu 255 260 265 270 aac ttc tac att caa agc ttg ctt ttt cac ccc aaa caa gaa gac tgg 866 Asn Phe Tyr Ile Gln Ser Leu Leu Phe His Pro Lys Gln Glu Asp Trp 275 280 285 att ctg gca tac agt caa gac caa aag tta tac agc tct gct gaa ttt 914 Ile Leu Ala Tyr Ser Gln Asp Gln Lys Leu Tyr Ser Ser Ala Glu Phe 290 295 300 ggg aga aga tgg cag ctt atc caa gaa ggg gtt gta cca aac agg ttc 962 Gly Arg Arg Trp Gln Leu Ile Gln Glu Gly Val Val Pro Asn Arg Phe 305 310 315 tac tgg tct gtg atg ggg tca aat aaa gaa cca gac ctt gtg cat ctt 1010 Tyr Trp Ser Val Met Gly Ser Asn Lys Glu Pro Asp Leu Val His Leu 320 325 330 gag gcc aga act gtg gat ggt cat tca cat tat cta act tgc cga atg 1058 Glu Ala Arg Thr Val Asp Gly His Ser His Tyr Leu Thr Cys Arg Met 335 340 345 350 cag aac tgt aca gag gcc aac agg aat cag cct ttt cca ggc tac att 1106 Gln Asn Cys Thr Glu Ala Asn Arg Asn Gln Pro Phe Pro Gly Tyr Ile 355 360 365 gac cca gac tct ttg att gtt cag gat cat tat gtg ttt gtt cag ctg 1154 Asp Pro Asp Ser Leu Ile Val Gln Asp His Tyr Val Phe Val Gln Leu 370 375 380 aca tca gga ggg cgg cca cat tac tac gtg tcc tac cga agg aat gca 1202 Thr Ser Gly Gly Arg Pro His Tyr Tyr Val Ser Tyr Arg Arg Asn Ala 385 390 395 ttt gcc caa atg aag ctt ccg aaa tat gct ttg ccc aag gac atg cat 1250 Phe Ala Gln Met Lys Leu Pro Lys Tyr Ala Leu Pro Lys Asp Met His 400 405 410 gtt atc agc acc gat gag aat cag gtg ttc gca gcg gtc caa gaa tgg 1298 Val Ile Ser Thr Asp Glu Asn Gln Val Phe Ala Ala Val Gln Glu Trp 415 420 425 430 aac cag aat gac acg tac aac ctc tac atc tca gac aca cgt ggt gtc 1346 Asn Gln Asn Asp Thr Tyr Asn Leu Tyr Ile Ser Asp Thr Arg Gly Val 435 440 445 tac ttc acc ctg gcc ttg gag aat gtc cag agc agc aga ggc cct gag 1394 Tyr Phe Thr Leu Ala Leu Glu Asn Val Gln Ser Ser Arg Gly Pro Glu 450 455 460 ggc aac atc atg atc gac ctc tat gag gta gca ggg ata aag gga atg 1442 Gly Asn Ile Met Ile Asp Leu Tyr Glu Val Ala Gly Ile Lys Gly Met 465 470 475 ttc ttg gct aac aag aag att gac aac caa gtg aag act 
ttc atc aca 1490 Phe Leu Ala Asn Lys Lys Ile Asp Asn Gln Val Lys Thr Phe Ile Thr 480 485 490 tat aac aaa ggc aga gac tgg cgt ttg ctg cag gcg ccg gac acg gat 1538 Tyr Asn Lys Gly Arg Asp Trp Arg Leu Leu Gln Ala Pro Asp Thr Asp 495 500 505 510 cta agg ggg gac ccc gtg cac tgc ttg ctg ccc tat tgc tca cta cac 1586 Leu Arg Gly Asp Pro Val His Cys Leu Leu Pro Tyr Cys Ser Leu His 515 520 525 ctt cac ctg aag gtc tct gag aat ccc tac aca tca ggg atc att gcc 1634 Leu His Leu Lys Val Ser Glu Asn Pro Tyr Thr Ser Gly Ile Ile Ala 530 535 540 agc aaa gac aca gct cca agc atc ata gtg gca tca ggt aat ata ggt 1682 Ser Lys Asp Thr Ala Pro Ser Ile Ile Val Ala Ser Gly Asn Ile Gly 545 550 555 tct gaa ttg tca gac act gac atc agc atg ttt gtc tct tca gat gca 1730 Ser Glu Leu Ser Asp Thr Asp Ile Ser Met Phe Val Ser Ser Asp Ala 560 565 570 ggg aac acc tgg aga cag atc ttt gaa gaa gag cac agt gtt ttg tac 1778 Gly Asn Thr Trp Arg Gln Ile Phe Glu Glu Glu His Ser Val Leu Tyr 575 580 585 590 ctg gat caa ggt gga gtc ctg gtt gct atg aaa cac aca tct ctc cca 1826 Leu Asp Gln Gly Gly Val Leu Val Ala Met Lys His Thr Ser Leu Pro 595 600 605 att cga cat ctt tgg ttg agt ttt gat gaa ggg aga tct tgg agc aaa 1874 Ile Arg His Leu Trp Leu Ser Phe Asp Glu Gly Arg Ser Trp Ser Lys 610 615 620 tac agt ttc aca tct att cca ctt ttt gtg gat ggg gtt ctg ggt gag 1922 Tyr Ser Phe Thr Ser Ile Pro Leu Phe Val Asp Gly Val Leu Gly Glu 625 630 635 cct gga gaa gag act ctc atc atg aca gtg ttt gga cac ttc agc cac 1970 Pro Gly Glu Glu Thr Leu Ile Met Thr Val Phe Gly His Phe Ser His 640 645 650 cgc tct gaa tgg cag ctg gtc aaa gta gat tac aag tcc att ttt gat 2018 Arg Ser Glu Trp Gln Leu Val Lys Val Asp Tyr Lys Ser Ile Phe Asp 655 660 665 670 aga cgg tgt gcc gaa gag gac tac aga cct tgg cag ctg cac agc cag 2066 Arg Arg Cys Ala Glu Glu Asp Tyr Arg Pro Trp Gln Leu His Ser Gln 675 680 685 ggg gaa gca tgt atc atg gga gca aaa agg ata tat aag aag cga aaa 2114 Gly Glu Ala Cys Ile Met Gly Ala Lys Arg Ile Tyr Lys Lys Arg Lys 690 695 700 tca gag 
cgg aag tgt atg caa gga aaa tat gca gga gct atg gaa tct 2162 Ser Glu Arg Lys Cys Met Gln Gly Lys Tyr Ala Gly Ala Met Glu Ser 705 710 715 gaa ccc tgt gtc tgc act gag gct gat ttt gat tgc gac tat ggt tat 2210 Glu Pro Cys Val Cys Thr Glu Ala Asp Phe Asp Cys Asp Tyr Gly Tyr 720 725 730 gag cga cac agc aat ggc cag tgc ctg ccg gca ttt tgg ttc aat cca 2258 Glu Arg His Ser Asn Gly Gln Cys Leu Pro Ala Phe Trp Phe Asn Pro 735 740 745 750 tcc tct ctg tca aag gat tgc agc ttg gga cag agt tac ctc aat agt 2306 Ser Ser Leu Ser Lys Asp Cys Ser Leu Gly Gln Ser Tyr Leu Asn Ser 755 760 765 act ggg tac agg aag gtg gtt tcc aat aat tgc act gat ggc gta agg 2354 Thr Gly Tyr Arg Lys Val Val Ser Asn Asn Cys Thr Asp Gly Val Arg 770 775 780 gaa cag tac act gcc aaa ccg cag aag tgc cca ggg aaa gcc ccg cgg 2402 Glu Gln Tyr Thr Ala Lys Pro Gln Lys Cys Pro Gly Lys Ala Pro Arg 785 790 795 ggg ctg cgg ata gtc acg gct gat gga aag ctg aca gcg gaa caa gga 2450 Gly Leu Arg Ile Val Thr Ala Asp Gly Lys Leu Thr Ala Glu Gln Gly 800 805 810 cac aac gtc act ctc atg gtg caa tta gaa gag ggt gat gtt cag cgg 2498 His Asn Val Thr Leu Met Val Gln Leu Glu Glu Gly Asp Val Gln Arg 815 820 825 830 aca ctc atc caa gtg gac ttt ggc gat ggt atc gcg gtg tct tac gtc 2546 Thr Leu Ile Gln Val Asp Phe Gly Asp Gly Ile Ala Val Ser Tyr Val 835 840 845 aat ctc agc tcc atg gaa gat ggg atc aaa cac gtc tat cag aac gtg 2594 Asn Leu Ser Ser Met Glu Asp Gly Ile Lys His Val Tyr Gln Asn Val 850 855 860 ggc att ttc cgt gtg acc gtg cag gtg gac aac agt ctg ggt tct gac 2642 Gly Ile Phe Arg Val Thr Val Gln Val Asp Asn Ser Leu Gly Ser Asp 865 870 875 agc gcc gtc ctg tac tta cat gta act tgt ccc ttg gag cac gtg cac 2690 Ser Ala Val Leu Tyr Leu His Val Thr Cys Pro Leu Glu His Val His 880 885 890 ctg tct ctt ccc ttt gtc acc aca aag aac aaa gag gtc aat gcg acg 2738 Leu Ser Leu Pro Phe Val Thr Thr Lys Asn Lys Glu Val Asn Ala Thr 895 900 905 910 gca gtg ctg tgg ccc agc caa gtg ggc acc ctc act tac gtg tgg tgg 2786 Ala Val Leu Trp Pro Ser Gln Val Gly Thr 
Leu Thr Tyr Val Trp Trp 915 920 925 tac gga aac aac acg gag cct ttg atc acc ttg gag gga agc ata tcc 2834 Tyr Gly Asn Asn Thr Glu Pro Leu Ile Thr Leu Glu Gly Ser Ile Ser 930 935 940 ttc aga ttt act tca gaa gga atg aat acc atc aca gtg cag gtc tca 2882 Phe Arg Phe Thr Ser Glu Gly Met Asn Thr Ile Thr Val Gln Val Ser 945 950 955 gct ggg aat gcc atc cta caa gac aca aag acc atc gca gta tat gag 2930 Ala Gly Asn Ala Ile Leu Gln Asp Thr Lys Thr Ile Ala Val Tyr Glu 960 965 970 gaa ttc cgg tct ctt cgc ttg tcc ttt tct cca aac ctg gat gac tac 2978 Glu Phe Arg Ser Leu Arg Leu Ser Phe Ser Pro Asn Leu Asp Asp Tyr 975 980 985 990 aac ccg gac atc cct gag tgg agg agg gac atc ggt cga gtc atc aaa 3026 Asn Pro Asp Ile Pro Glu Trp Arg Arg Asp Ile Gly Arg Val Ile Lys 995 1000 1005 aaa tcc ctg gtg gaa gcc aca ggg gtt cca ggc cag cac atc ctg gtg 3074 Lys Ser Leu Val Glu Ala Thr Gly Val Pro Gly Gln His Ile Leu Val 1010 1015 1020 gcg gtg ctc cct ggc tta ccc acc act gct gaa ctc ttt gtc cta ccc 3122 Ala Val Leu Pro Gly Leu Pro Thr Thr Ala Glu Leu Phe Val Leu Pro 1025 1030 1035 tat cag gat cca gct gga gaa aac aaa agg tca act gat gac ctg gag 3170 Tyr Gln Asp Pro Ala Gly Glu Asn Lys Arg Ser Thr Asp Asp Leu Glu 1040 1045 1050 cag ata tca gaa ttg ctg atc cac acg ctc aac caa aac tca gta cac 3218 Gln Ile Ser Glu Leu Leu Ile His Thr Leu Asn Gln Asn Ser Val His 1055 1060 1065 1070 ttc gag ctg aag cca gga gtc cga gtc ctt gtc cat gct gct cac tta 3266 Phe Glu Leu Lys Pro Gly Val Arg Val Leu Val His Ala Ala His Leu 1075 1080 1085 aca gcg gcc ccc ctg gtg gac ctc act cca acc cac agt gga tct gcc 3314 Thr Ala Ala Pro Leu Val Asp Leu Thr Pro Thr His Ser Gly Ser Ala 1090 1095 1100 atg ctg atg ctg ctc tca gtg gtg ttt gtg ggg ctg gca gtg ttc gtc 3362 Met Leu Met Leu Leu Ser Val Val Phe Val Gly Leu Ala Val Phe Val 1105 1110 1115 atc tac aag ttt aaa agg aga gta gct tta ccc tcc cct ccc tcc cct 3410 Ile Tyr Lys Phe Lys Arg Arg Val Ala Leu Pro Ser Pro Pro Ser Pro 1120 1125 1130 tct act caa cct ggt gac tca tct ctc cga ttg 
caa aga gca aga cac 3458 Ser Thr Gln Pro Gly Asp Ser Ser Leu Arg Leu Gln Arg Ala Arg His 1135 1140 1145 1150 gcc act ccg cct tca acg cca aag cgg gga tct gct ggg gca cag tat 3506 Ala Thr Pro Pro Ser Thr Pro Lys Arg Gly Ser Ala Gly Ala Gln Tyr 1155 1160 1165 gca att taaggaaaac ccccaaaggc tacaggcgac ctgctgatca ggaaagaatt 3562 Ala Ile tcgctcttgt caagtacatc atccttcatg accactaact ttgtgttttt tttctttcct 3622 ttgttgttct gtttcctatt ttgccaggaa gtatttccat agttgctgag aatcaaagca 3682 caaaagaaat ccctacctat gtaaatgttt gaatggagga cgccagtaaa aaaacaaaaa 3742 caaaaacaaa acaaaacata aaatataaac aatcaaaatc caaacaaaca aacaaacact 3802 cactgcatcg ggacttttta attcttcaga cacagacaac aagggttttt agctttaagc 3862 ctgtgcatgt ggacaatact ctgagaacat gtctggaggg gcagtgtaca ggtgctctat 3922 tttaatggaa aacactcccc tctccctctt tcttcttctc tctctttttt ctgatcggtc 3982 gtgtttgtag aaaattcata acatatatag gccaaggaaa tctgcatgta tttttggaaa 4042 tattcttggc tctagattta tcagctattt tagcattaaa ggctgatggg tggattagct 4102 accccagctc ctttcataag acacaaagac gtgcacagga gtttgaaacc ctgagctgtt 4162 tctctgttcc actttccata ctgtcatttc ccttctaagt tagctttggt agctctgttg 4222 gtcagcagtg gccacaaggc tgctttctgc actctctgtg gccacaggaa caaagatgcg 4282 agttgaagat ccctttgtgc cggtagaata aggaaggaag ggaggcaggg aggcaaagca 4342 tgtcagaaac ggaccttggt ttcctgttca cctctttcca ccatgattgc ctccccttgc 4402 agctttccct actcctgccc caactgcagt aggaaatgga atcccattag ttacccattg 4462 tcctgtcttc acatttgttt gtcatccatg tgtgaccatg atctgttgat atctttgaat 4522 ctcttccctc caaaccccaa gtaagttgtc cttcaactgt tctcagtttt tcttcccaaa 4582 catattgcta gattctggac attaaccctg acgtttccta aatactggcc tggcctgtgc 4642 cgcccctgcc cccactactc acaagtttct gagccctttg tcagttttgt acattcccaa 4702 gcatgccagc ttctgtgccc atggaccttg ctgcatcctg cacagcaggg ttcagtttca 4762 tgtatctttg tcttttcctc tagaacctgc ctttttcaaa tatccctgcc ttttcccacc 4822 tggaaacata tactgtatgc gaaagagttg atatatgcaa agtaattcaa acctggccat 4882 gtactttgga aaagaaagca tagagatgaa ttgtggtgtt ctcacacctt aacgaaaatc 4942 tcgtgcgatc tgattccaga agattcttgt 
gagaaatgtt ttgaatgtgt gacaatttcg 5002 cagggcaatt tcctcctgtg acggttgctg ttacctggta tttccactct cacagtagaa 5062 tagaaatgtt tgtgaaataa aactgatttt aactggaagg agaaacaggt taatggactt 5122 tgtgtttaag agtgtcaaac agtctgagag aataaatggg ggtcttgtct tacattaggg 5182 tgagagtttg attatttagg atgatcccag tgatttcatg tgtgctctct gacctctgag 5242 tatcatagta tcattaaagt aaagatttaa gtctgtataa aggaggagag ttactgattg 5302 gcaaaaaaga ttggcacaaa gcatgaagaa accccatttt tcccagggta atcatgaaag 5362 aaggctcaga gaaagaggga aacaaaagcc tgttcagcag aggccccttt agtattatgg 5422 actgggcaaa gcccactata aactatagga gaaagaagtt ctgataaacc tcttagtatg 5482 ctcagcctcc ttctttgcta tgtccctaag ccaacagggt ataacacagg ccctggtgat 5542 aatgagggtg tcctaaagat cctgatgtcc atgacttcat gtagttcagg cacagaaaga 5602 atagcaaagc tgctgtcaga agtttgtcag gcagagatgt ttatggttga ggatacccca 5662 tatgaatctg ggaaatgggc taggcctgag gcagtcacct atttgtagaa ctcaattgat 5722 ccccaacctg ccaggcttcc tatctcatag gaaacatcct cattgatcct cttcagttgg 5782 agtctcccaa atttaatgtg gaagacaaag tgttggaaaa tcaaaagagg ctcactcaaa 5842 ccagcacagg gagtcctcag agttgcagtt caattcattt taattagaaa agtatgaaag 5902 aggtataaca ctcttattct agcacagtgc cttgcctaga gtagtagtat gaaaactatt 5962 tgttgaagaa ataaatgaat ggaaagacat agggaagaat gggctgtaaa tcctaattaa 6022 aaatagaggt cgacatgaga tacctggcat tttgggaagt gaccaaagat ggccagctag 6082 agattcagca tctccaggat cctcatgtgc ctctcctcaa agctccctcc catctgtagg 6142 agattagtga ggcaaggtgc tgctcagaga ggaggacctc attgttctta ggaccttggc 6202 cagttgttct caaagtgtgg tacctggatc agcagtatca ctatcacttg ggctagttag 6262 agatccagat tctggatcca taccccatac ctactgaatt agaaactcta ggggatgaat 6322 agtggtgttc tctgttgcta agtgacagtg ggcccagcca cctattttaa caagccctcc 6382 cagtgattgt gatacatact aaagtttgag aaccattgta ctaggccatt ccagctgaat 6442 ctcaaacaga aggcagtaat gagagcctac aaatgggagg gacctaagtg cctacctact 6502 cgctaatcgc aggtgcaaac acacaaggag tttggtgggc ttaaggtcag aggagtgtgt 6562 agggagggat gtatgtggaa ggtaagattc agggcaagct aaaaatccga tactgcaacg 6622 ttttccaaaa tcccagaagg caaactgtgc atgttctacc 
ctgaaccacc caagcaacac 6682 tttctacctt gccttatttt taattggatt cactgtccaa aatgcagagg tttgctttgc 6742 ttttttttca gaagttccaa acagcaactt tgagagcagt ggggtgcttg gcagctgttc 6802 tgtgttttcc aggaatccaa ctgagcattg aaatctctca tttgccgact tatttttata 6862 ggaagccaat taaaaaaaaa aaaagttttc ttatagtatt ggaactactt ctaattttaa 6922 aatgactttt ttgatgtatt ttttgttaaa tactatgtag tgtaatgtat aattgctctt 6982 gtttattgct tttacaatca tatttattaa acagataatg tctctaaagt ctttgcctca 7042 ggtatttttt tttttaatcc taaacccttg gtgttcattc taaatataga agtgttgcat 7102 gtataggatt tcataaaggc taattgcata agaaagagta aacaccacag gcttgaggtt 7162 tttggctgtt ttttactaac aaggcagaat gtatgtacta cctgaattct acctgcattt 7222 caattaacta tacaatgtct gtttattaaa ttactttgat ttaaaaatta 7272 4 1168 PRT Human 4 Met Gly Lys Val Gly Ala Gly Gly Gly Ser Gln Ala Arg Leu Ser Ala 1 5 10 15 Leu Leu Ala Gly Ala Gly Leu Leu Ile Leu Cys Ala Pro Gly Val Cys 20 25 30 Gly Gly Gly Ser Cys Cys Pro Ser Pro His Pro Ser Ser Ala Pro Arg 35 40 45 Ser Ala Ser Thr Pro Arg Gly Phe Ser His Gln Gly Arg Pro Gly Arg 50 55 60 Ala Pro Ala Thr Pro Leu Pro Leu Val Val Arg Pro Leu Phe Ser Val 65 70 75 80 Ala Pro Gly Asp Arg Ala Leu Ser Leu Glu Arg Ala Arg Gly Thr Gly 85 90 95 Ala Ser Met Ala Val Ala Ala Arg Ser Gly Arg Arg Arg Arg Ser Gly 100 105 110 Ala Asp Gln Glu Lys Ala Glu Arg Gly Glu Gly Ala Ser Arg Ser Pro 115 120 125 Arg Gly Val Leu Arg Asp Gly Gly Gln Gln Glu Pro Gly Thr Arg Glu 130 135 140 Arg Asp Pro Asp Lys Ala Thr Arg Phe Arg Met Glu Glu Leu Arg Leu 145 150 155 160 Thr Ser Thr Thr Phe Ala Leu Thr Gly Asp Ser Ala His Asn Gln Ala 165 170 175 Met Val His Trp Ser Gly His Asn Ser Ser Val Ile Leu Ile Leu Thr 180 185 190 Lys Leu Tyr Asp Tyr Asn Leu Gly Ser Ile Thr Glu Ser Ser Leu Trp 195 200 205 Arg Ser Thr Asp Tyr Gly Thr Thr Tyr Glu Lys Leu Asn Asp Lys Val 210 215 220 Gly Leu Lys Thr Ile Leu Gly Tyr Leu Tyr Val Cys Pro Thr Asn Lys 225 230 235 240 Arg Lys Ile Met Leu Leu Thr Asp Pro Glu Ile Glu Ser Ser Leu Leu 245 250 255 Ile Ser Ser Asp Glu Gly Ala Thr Tyr Gln Lys 
Tyr Arg Leu Asn Phe 260 265 270 Tyr Ile Gln Ser Leu Leu Phe His Pro Lys Gln Glu Asp Trp Ile Leu 275 280 285 Ala Tyr Ser Gln Asp Gln Lys Leu Tyr Ser Ser Ala Glu Phe Gly Arg 290 295 300 Arg Trp Gln Leu Ile Gln Glu Gly Val Val Pro Asn Arg Phe Tyr Trp 305 310 315 320 Ser Val Met Gly Ser Asn Lys Glu Pro Asp Leu Val His Leu Glu Ala 325 330 335 Arg Thr Val Asp Gly His Ser His Tyr Leu Thr Cys Arg Met Gln Asn 340 345 350 Cys Thr Glu Ala Asn Arg Asn Gln Pro Phe Pro Gly Tyr Ile Asp Pro 355 360 365 Asp Ser Leu Ile Val Gln Asp His Tyr Val Phe Val Gln Leu Thr Ser 370 375 380 Gly Gly Arg Pro His Tyr Tyr Val Ser Tyr Arg Arg Asn Ala Phe Ala 385 390 395 400 Gln Met Lys Leu Pro Lys Tyr Ala Leu Pro Lys Asp Met His Val Ile 405 410 415 Ser Thr Asp Glu Asn Gln Val Phe Ala Ala Val Gln Glu Trp Asn Gln 420 425 430 Asn Asp Thr Tyr Asn Leu Tyr Ile Ser Asp Thr Arg Gly Val Tyr Phe 435 440 445 Thr Leu Ala Leu Glu Asn Val Gln Ser Ser Arg Gly Pro Glu Gly Asn 450 455 460 Ile Met Ile Asp Leu Tyr Glu Val Ala Gly Ile Lys Gly Met Phe Leu 465 470 475 480 Ala Asn Lys Lys Ile Asp Asn Gln Val Lys Thr Phe Ile Thr Tyr Asn 485 490 495 Lys Gly Arg Asp Trp Arg Leu Leu Gln Ala Pro Asp Thr Asp Leu Arg 500 505 510 Gly Asp Pro Val His Cys Leu Leu Pro Tyr Cys Ser Leu His Leu His 515 520 525 Leu Lys Val Ser Glu Asn Pro Tyr Thr Ser Gly Ile Ile Ala Ser Lys 530 535 540 Asp Thr Ala Pro Ser Ile Ile Val Ala Ser Gly Asn Ile Gly Ser Glu 545 550 555 560 Leu Ser Asp Thr Asp Ile Ser Met Phe Val Ser Ser Asp Ala Gly Asn 565 570 575 Thr Trp Arg Gln Ile Phe Glu Glu Glu His Ser Val Leu Tyr Leu Asp 580 585 590 Gln Gly Gly Val Leu Val Ala Met Lys His Thr Ser Leu Pro Ile Arg 595 600 605 His Leu Trp Leu Ser Phe Asp Glu Gly Arg Ser Trp Ser Lys Tyr Ser 610 615 620 Phe Thr Ser Ile Pro Leu Phe Val Asp Gly Val Leu Gly Glu Pro Gly 625 630 635 640 Glu Glu Thr Leu Ile Met Thr Val Phe Gly His Phe Ser His Arg Ser 645 650 655 Glu Trp Gln Leu Val Lys Val Asp Tyr Lys Ser Ile Phe Asp Arg Arg 660 665 670 Cys Ala Glu Glu Asp Tyr Arg Pro Trp Gln Leu His 
Ser Gln Gly Glu 675 680 685 Ala Cys Ile Met Gly Ala Lys Arg Ile Tyr Lys Lys Arg Lys Ser Glu 690 695 700 Arg Lys Cys Met Gln Gly Lys Tyr Ala Gly Ala Met Glu Ser Glu Pro 705 710 715 720 Cys Val Cys Thr Glu Ala Asp Phe Asp Cys Asp Tyr Gly Tyr Glu Arg 725 730 735 His Ser Asn Gly Gln Cys Leu Pro Ala Phe Trp Phe Asn Pro Ser Ser 740 745 750 Leu Ser Lys Asp Cys Ser Leu Gly Gln Ser Tyr Leu Asn Ser Thr Gly 755 760 765 Tyr Arg Lys Val Val Ser Asn Asn Cys Thr Asp Gly Val Arg Glu Gln 770 775 780 Tyr Thr Ala Lys Pro Gln Lys Cys Pro Gly Lys Ala Pro Arg Gly Leu 785 790 795 800 Arg Ile Val Thr Ala Asp Gly Lys Leu Thr Ala Glu Gln Gly His Asn 805 810 815 Val Thr Leu Met Val Gln Leu Glu Glu Gly Asp Val Gln Arg Thr Leu 820 825 830 Ile Gln Val Asp Phe Gly Asp Gly Ile Ala Val Ser Tyr Val Asn Leu 835 840 845 Ser Ser Met Glu Asp Gly Ile Lys His Val Tyr Gln Asn Val Gly Ile 850 855 860 Phe Arg Val Thr Val Gln Val Asp Asn Ser Leu Gly Ser Asp Ser Ala 865 870 875 880 Val Leu Tyr Leu His Val Thr Cys Pro Leu Glu His Val His Leu Ser 885 890 895 Leu Pro Phe Val Thr Thr Lys Asn Lys Glu Val Asn Ala Thr Ala Val 900 905 910 Leu Trp Pro Ser Gln Val Gly Thr Leu Thr Tyr Val Trp Trp Tyr Gly 915 920 925 Asn Asn Thr Glu Pro Leu Ile Thr Leu Glu Gly Ser Ile Ser Phe Arg 930 935 940 Phe Thr Ser Glu Gly Met Asn Thr Ile Thr Val Gln Val Ser Ala Gly 945 950 955 960 Asn Ala Ile Leu Gln Asp Thr Lys Thr Ile Ala Val Tyr Glu Glu Phe 965 970 975 Arg Ser Leu Arg Leu Ser Phe Ser Pro Asn Leu Asp Asp Tyr Asn Pro 980 985 990 Asp Ile Pro Glu Trp Arg Arg Asp Ile Gly Arg Val Ile Lys Lys Ser 995 1000 1005 Leu Val Glu Ala Thr Gly Val Pro Gly Gln His Ile Leu Val Ala Val 1010 1015 1020 Leu Pro Gly Leu Pro Thr Thr Ala Glu Leu Phe Val Leu Pro Tyr Gln 1025 1030 1035 1040 Asp Pro Ala Gly Glu Asn Lys Arg Ser Thr Asp Asp Leu Glu Gln Ile 1045 1050 1055 Ser Glu Leu Leu Ile His Thr Leu Asn Gln Asn Ser Val His Phe Glu 1060 1065 1070 Leu Lys Pro Gly Val Arg Val Leu Val His Ala Ala His Leu Thr Ala 1075 1080 1085 Ala Pro Leu Val Asp Leu Thr Pro 
Thr His Ser Gly Ser Ala Met Leu 1090 1095 1100 Met Leu Leu Ser Val Val Phe Val Gly Leu Ala Val Phe Val Ile Tyr 1105 1110 1115 1120 Lys Phe Lys Arg Arg Val Ala Leu Pro Ser Pro Pro Ser Pro Ser Thr 1125 1130 1135 Gln Pro Gly Asp Ser Ser Leu Arg Leu Gln Arg Ala Arg His Ala Thr 1140 1145 1150 Pro Pro Ser Thr Pro Lys Arg Gly Ser Ala Gly Ala Gln Tyr Ala Ile 1155 1160 1165 

We claim:
 1. A method of assessing whether a human subject is susceptible to type 2 diabetes comprising the step of determining which allele of the SorCS1 or SorCS3 gene is present in the genome of that subject.
 2. A method of assessing whether a human subject is susceptible to type 2 diabetes comprising the step of analyzing the nucleic acid sequence of the subject in the SorCS1 or SorCS3 gene.
 3. A method for determining whether a human being is a candidate for developing type 2 diabetes, the method comprising the steps of: determining the sequence of the protein coding region of the SorCS1 or SorCS3 gene of the human being; deducing the amino acid sequence encoded by the region sequenced; and comparing the amino acid sequence to SEQ ID NO:2 or SEQ ID NO:4, respectively, wherein a difference observed indicates the human being as a candidate for developing type 2 diabetes.
 4. A method for determining whether a human being is a candidate for developing type 2 diabetes, the method comprising the step of: determining the mRNA or protein expression level of either SorCS1 or SorCS3 in the human being, wherein an expression level outside the normal range of expression established from type 2 diabetes-free individuals indicates that the human being is a candidate for developing diabetes.
 5. A method for identifying an agent that interacts with SorCS1 protein, the method comprising the steps of: exposing a SorCS1 protein to a test agent; and determining whether the test agent binds to the SorCS1 protein.
 6. The method of claim 5, wherein the SorCS1 protein is from a human, a mouse or a rat.
 7. A method for preventing or treating type 2 diabetes in a human being, the method comprising the step of administering neurotensin to the human being in an amount sufficient to prevent or treat type 2 diabetes.
 8. A method for identifying a therapeutic agent, or analog thereof, which is useful for the treatment of type 2 diabetes and related diseases, the method comprising the steps of: exposing a SorCS1 protein to a test agent; and determining whether the test agent modulates the biological activity of the SorCS1 protein.
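
By way of illustration only, the comparison recited in claim 3 (sequencing the protein coding region, deducing the encoded amino acid sequence, and comparing it to a reference such as SEQ ID NO:4) might be sketched as follows. This is a minimal sketch and not part of the claimed subject matter. The function names `translate` and `differences` are hypothetical, the standard genetic code is assumed, and one-letter amino acid codes are used in place of the three-letter codes of the sequence listing. The short reference fragment is the first 14 codons of the coding region of SEQ ID NO:3.

```python
from itertools import product, zip_longest

# Standard genetic code, codons enumerated in T, C, A, G order (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Deduce the amino acid sequence (one-letter codes) from a coding sequence.

    Translation stops at the first stop codon; a trailing partial codon is ignored.
    """
    cds = "".join(cds.split()).upper()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        residue = CODON_TABLE[cds[i:i + 3]]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)


def differences(observed, reference):
    """Return (position, reference residue, observed residue) for each mismatch.

    Positions are 1-based; '-' marks a residue missing from the shorter sequence.
    """
    return [
        (pos, ref, obs)
        for pos, (obs, ref) in enumerate(
            zip_longest(observed, reference, fillvalue="-"), start=1
        )
        if obs != ref
    ]


# First 14 codons of the coding region of SEQ ID NO:3 (SorCS3).
reference_cds = "atg gga aaa gtt ggc gcc ggc ggc ggc tcc caa gcc cgg ctg"
reference_protein = translate(reference_cds)

# Hypothetical subject sequence with a single base change in codon 4 (gtt -> gct).
subject_cds = "atg gga aaa gct ggc gcc ggc ggc ggc tcc caa gcc cgg ctg"
subject_protein = translate(subject_cds)

print(differences(subject_protein, reference_protein))
```

In this sketch, any non-empty list returned by `differences` corresponds to "a difference observed" in the language of claim 3. A practical implementation would also need to handle sequence alignment, insertions and deletions, which this position-by-position comparison does not attempt.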